[pull] master from DataDog:master #544

Merged
pull[bot] merged 4 commits into ConnectionMaster:master from DataDog:master
May 18, 2026

pull[bot] commented May 18, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

nbeckstead-ddog and others added 4 commits May 18, 2026 17:01
* Set device.os.type to Linux on all OCSF events

Add a schema-category-mapper that sets ocsf.device.os.type and
ocsf.device.os.type_id (200/Linux) on every event the integration
emits. This gives downstream rules a stable, source-agnostic way to
filter for Linux events. For example, cross-source detection rules can
use @ocsf.device.os.type:Linux to scope to Linux endpoints without
depending on metadata.product.vendor_name (which encodes the source,
not the OS).
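
As an illustration only, the effect of this mapper can be sketched in Python. The real change is a schema-category-mapper in the pipeline YAML; the function names `tag_linux` and `is_linux` and the dict-based event shape are assumptions made for this example.

```python
import copy

# Values taken from the commit message: type "Linux", type_id 200.
LINUX_OS = {"type": "Linux", "type_id": 200}


def tag_linux(event: dict) -> dict:
    """Return a copy of the event with ocsf.device.os.type/type_id set."""
    out = copy.deepcopy(event)
    os_block = (
        out.setdefault("ocsf", {})
        .setdefault("device", {})
        .setdefault("os", {})
    )
    os_block.update(LINUX_OS)
    return out


def is_linux(event: dict) -> bool:
    """Rough analogue of the query @ocsf.device.os.type:Linux."""
    os_block = event.get("ocsf", {}).get("device", {}).get("os", {})
    return os_block.get("type") == "Linux"
```

A rule written against `is_linux` does not need to inspect the vendor name at all, which is the source-agnostic property the commit is after.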

Also add profiles: [host] to the SOCKADDR Network Activity
sub-pipeline so device.os.* validates against the OCSF schema there
(Network Activity has no native device attribute).

Test fixtures updated to expect device.os in all 29 ocsf: blocks, and
the SOCKADDR fixture to expect metadata.profiles: [host].

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Fix OCSF validation: device.os.name, base_event host profile, device.type_id

Address validator errors after adding device.os.type to all sub-pipelines:

- ocsf.device.os.name is required by the OCSF OS object; set it to
  "Linux" via a top-level string-builder in the OCSF pre-transformations
  sub-pipeline so it applies to every event.
- ocsf.device.name is required to satisfy the Device object's
  at_least_one constraint on sub-pipelines that don't otherwise
  populate hostname/ip; set to "Unknown" via the same top-level
  string-builder.
- Base Event class natively includes the host profile in OCSF;
  declaring profiles: [host] on the Base Event schema-processor
  brings the pipeline in line so device.* validates.
- ocsf.device.type_id was missing from Base Event, SOCKADDR Network
  Activity, and SYSCALL Network Activity sub-pipelines; added a
  schema-category-mapper (Unknown / 0) to each.

Test fixtures updated to reflect the new fields: device.os.name on all
29 fixtures, device.name on the 11 fixtures that previously lacked it
(9 IAM/Device-Config with hostname/ip, 2 Network 4001), device.type
and device.type_id on the 11 fixtures that previously lacked the full
device shape (9 Base Event, 2 Network 4001), and metadata.profiles:
[host] on the 9 Base Event fixtures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…3728)

* Avoid cleanup when cancel called while check running

* Add changelog

* Check if cancelled before running jobs

* Add some debug lines for cancel flow
* [OCSF] Zeek/Corelight pipeline

Add OCSF v1.5.0 normalization for Zeek/Corelight logs, covering 7 log
types across 5 OCSF classes (Detection Finding, Network Activity, HTTP
Activity, DNS Activity, File Hosting Activity).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Fix validate-logs errors in zeek.yaml

Resolve 36 validation errors flagged by the datadog-assets validator:
- Add missing `overrideOnConflict: false` to 3 attribute-remappers
- Fix 2 schema-remapper names to backtick individual fields
- Rename 25 facets to match validator's canonical names and add
  `type: integer`/`facetType: range` where required
- Remove 6 facets with unresolvable path conflicts (validator demanded
  unique paths with no canonical definition available)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Fix severity mapping for Detection Finding [2004] Notice

Notice events emit `severity.name` capitalized ("High", "Medium", etc.),
so the lowercase `@severity.name:informational` filters never matched
and the fallback assigned `ocsf.severity_id: 99` while preserving the
capitalized name as `ocsf.severity`. Switch the schema-category-mapper
to filter on the numeric `severity.id` (1-5) which Corelight reliably
emits, and update the notice fixture's expected `severity_id` from 99
to 4 to reflect the corrected mapping.
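
A minimal Python sketch of why the name-based filter missed and the id-based one does not. The one-to-one mapping from `severity.id` 1-5 to OCSF `severity_id` 1-5 is an assumption for illustration; the actual category filters live in zeek.yaml and are not reproduced in this message.

```python
# Assumed mapping: Corelight severity.id 1-5 lines up with OCSF severity_id 1-5.
SEVERITY_NAMES = {1: "Informational", 2: "Low", 3: "Medium", 4: "High", 5: "Critical"}
OTHER = 99  # OCSF "Other", used by the fallback


def severity_id_by_name(event: dict) -> int:
    """Old behaviour: case-sensitive match against lowercase names."""
    lowercase = {name.lower(): sid for sid, name in SEVERITY_NAMES.items()}
    name = event.get("severity", {}).get("name")
    return lowercase.get(name, OTHER)


def severity_id_by_id(event: dict) -> int:
    """New behaviour: match on the numeric severity.id."""
    sid = event.get("severity", {}).get("id")
    return sid if sid in SEVERITY_NAMES else OTHER
```

For a notice carrying `{"severity": {"name": "High", "id": 4}}`, the first function falls through to 99 because "High" is not "high", while the second returns 4, matching the updated fixture.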

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Add catch-all category to schema-category-mappers with fallback

Each schema-category-mapper that defines a fallback must also have a
catch-all filter category at the end matching the fallback's values.
Six mappers were missing the trailing catch-all: notice/alert
severity_id (2004), http activity_id/status_id (4002), dns rcode_id,
and dns status_id (4003). Append `query: "*"` -> Other/99 to each.
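
The rule can be expressed as a small check. This is a sketch under assumptions: the dict layout (`fallback`, `categories`, `query`, `values`) is hypothetical and is not the real pipeline schema or the validator's implementation.

```python
def missing_catch_all(mapper: dict) -> bool:
    """True when a mapper has a fallback but no trailing "*" category matching it."""
    fallback = mapper.get("fallback")
    if fallback is None:
        return False  # the rule only applies to mappers that define a fallback
    categories = mapper.get("categories", [])
    if not categories:
        return True
    last = categories[-1]
    return last.get("query") != "*" or last.get("values") != fallback


def append_catch_all(mapper: dict) -> dict:
    """Return a copy with the catch-all appended when it is missing."""
    if not missing_catch_all(mapper):
        return mapper
    catch_all = {"query": "*", "values": mapper["fallback"]}
    return {**mapper, "categories": [*mapper.get("categories", []), catch_all]}
```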

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Apply PR review feedback for Zeek/Corelight OCSF pipeline

Direct mappings, dead-code removal, correctness fixes, and OCSF validator
cleanups across notice, suricata, conn, ssl, weird, http, dns, and file
hosting sub-pipelines:

- Map directly to OCSF targets where intermediates were unnecessary
  (ocsf.time, ocsf.duration, ocsf.traffic.packets, JA3/JA3S algorithm_id,
  weird protocol_name).
- Drop dead/auto-generated mappers: notice/suricata category_uid (set by
  schema-processor), self-maps of finding_info.uid, event_code, file.hashes
  (when unbuilt upstream), suricata community_id correlation_uid, HTTP
  version-as-protocol_ver, DNS direction derivation, and the DNS rcode_id
  catch-all/fallback (recommended-not-required).
- Convert suricata alert.signature_id event_code from string-builder to
  schema-remapper.
- Combine domain/query into single ocsf.query.hostname schema-remapper.
- Fix DNS Activity filters: use rcode_name presence to discriminate
  Response/Query instead of dns.answer.name (handles NXDOMAIN responses).
- DNS status_id catch-all renamed Other/99 -> Unknown/0 to satisfy the
  OCSF validator's suspicious-Other check.
- File Hosting tx_hosts/rx_hosts: drop the second intermediate field;
  grok targets ocsf.{src,dst}_endpoint.ip directly off a single stringify.
- Switch fallback source fields per Jonah's suggestions:
  severity -> severity.name, alert.severity -> alert_severity,
  http status -> status_msg, dns rcode/status -> rcode_name.
- Notice fixture: use id.orig_h/id.resp_h connection fields instead of
  the suricata-style src.

Regenerated zeek_tests.yaml with the OCSF validator (--check-all --write).
All 14 logs pass validation with no errors or warnings.
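
The DNS Activity filter fix in the list above can be sketched as follows. The flat, dotted-key log dicts are an assumption for illustration; the activity_id values 1 (Query) and 2 (Response) are the OCSF DNS Activity ones.

```python
QUERY, RESPONSE = 1, 2  # OCSF DNS Activity activity_id values


def activity_by_answer(log: dict) -> int:
    """Old discriminator: a log is a Response only if it carries an answer."""
    return RESPONSE if log.get("dns.answer.name") else QUERY


def activity_by_rcode(log: dict) -> int:
    """New discriminator: any log with an rcode_name is a Response."""
    return RESPONSE if log.get("rcode_name") else QUERY
```

An NXDOMAIN response has an `rcode_name` but no answer, so the old discriminator would label it a Query and the new one labels it a Response.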

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Map Zeek DNS answers to ocsf.answers as dns_answer objects

Use two array-processors to wrap the first Zeek `answers` string into a
dns_answer object and append it to ocsf.answers: the first selects the
first array element into ocsf.answer.rdata, the second appends
ocsf.answer onto ocsf.answers. Only the first answer is captured (the
pipeline DSL has no per-element iteration), but that covers the common
single-A-record case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Add catch-all for activity_id

* Fix validate-logs failure for DNS answers wrapper

The previous array-processor type:select required operation.filter and
operation.valueToExtract per the asset validator, but those only apply
to object arrays - Zeek's `answers` is a primitive string array. Switch
to string-builder + grok-parser to extract the first answer string into
ocsf.answer.rdata, then keep the array-processor append to wrap it into
ocsf.answers as a dns_answer object.
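
A rough Python analogue of the three steps (stringify, extract the first item, append). The regex stands in for the grok rule and is not the pattern used in zeek.yaml; the function names and event shape are assumptions for this sketch.

```python
import re
from typing import List, Optional

# Stand-in for the grok step: capture everything up to the first comma.
FIRST_ITEM = re.compile(r"^([^,]+)")


def first_answer(answers: List[str]) -> Optional[str]:
    joined = ",".join(answers)        # string-builder step: flatten the array
    match = FIRST_ITEM.match(joined)  # grok step: keep only the first element
    return match.group(1).strip() if match else None


def wrap_answers(event: dict) -> dict:
    """Append the first answer to ocsf.answers as a dns_answer-like object."""
    rdata = first_answer(event.get("answers", []))
    if rdata is None:
        return event
    ocsf = dict(event.get("ocsf", {}))
    ocsf["answers"] = [*ocsf.get("answers", []), {"rdata": rdata}]
    return {**event, "ocsf": ocsf}
```

As in the pipeline, only the first answer survives; additional answers in the array are dropped by design.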

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Address codex review feedback for file pipeline

- Include `files_red` in the File Hosting [6006] sub-pipeline filter so
  redacted file events get OCSF class_uid/activity_id/file fields, not
  just the pre-transform metadata.
- Prefer `filename` over `fuid` when populating `ocsf.file.name`; fall
  back to `fuid` only when `filename` is absent. The `fuid` mapping to
  `ocsf.file.uid` is unaffected.
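
The name-selection rule in the second bullet, sketched in Python. The function name and the flat log dict are assumptions for illustration.

```python
def file_fields(log: dict) -> dict:
    """Build the ocsf.file name/uid pair from a Zeek files log entry."""
    fields = {}
    name = log.get("filename") or log.get("fuid")  # prefer filename, else fuid
    if name:
        fields["name"] = name
    if log.get("fuid"):
        fields["uid"] = log["fuid"]  # uid always comes from fuid
    return fields
```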

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Drop pipeline intermediates, fix multi-IP grok, restore file.hashes

- is_alert (notice 2004, suricata 2004): string-builder writes directly
  to `ocsf.is_alert`; grok-parser converts in place. Drops the
  `_is_alert_str` intermediate.
- DNS answers: stringify directly into `ocsf.answer`; grok extracts
  `ocsf.answer.rdata` via `a %{data:ocsf.answer.rdata}(,%{data})?` so
  the comma-separated multi-IP form parses correctly. Drops the
  `_answers_str` intermediate.
- File Hosting tx/rx hosts: stringify directly into
  `ocsf.{src,dst}_endpoint`; grok extracts `.ip` via
  `g %{ip:ocsf.{src,dst}_endpoint.ip}(,%{data})?` for multi-IP. Drops
  the `_tx_hosts_str`/`_rx_hosts_str` intermediates.
- Connection 4001: arithmetic-processor writes total bytes directly to
  `ocsf.traffic.bytes`; the schema-processor remapper becomes a
  self-map. Drops the `_total_bytes` intermediate (matches the
  earlier _total_packets/_duration_ms cleanup).
- Restore `ocsf.file.hashes`: build `tmp_md5`/`tmp_sha1`/`tmp_sha256`
  fingerprint objects (algorithm name, integer algorithm_id, value),
  array-processor append each into `ocsf.file.hashes`, and self-map
  the array inside the 6006 schema-processor.
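
The `ocsf.file.hashes` construction from the last bullet can be sketched like this. The source field names `md5`, `sha1`, and `sha256` are assumed to follow Zeek's files log; the algorithm_id values 1, 2, and 3 are the OCSF fingerprint ids for MD5, SHA-1, and SHA-256.

```python
# (source field, OCSF algorithm name, OCSF algorithm_id)
ALGORITHMS = (
    ("md5", "MD5", 1),
    ("sha1", "SHA-1", 2),
    ("sha256", "SHA-256", 3),
)


def build_hashes(log: dict) -> list:
    """Return fingerprint objects for whichever hashes the log carries."""
    hashes = []
    for field, name, algorithm_id in ALGORITHMS:
        value = log.get(field)
        if value:  # skip hashes Zeek did not compute
            hashes.append(
                {"algorithm": name, "algorithm_id": algorithm_id, "value": value}
            )
    return hashes
```

Each entry plays the role of one `tmp_*` object, and the loop plays the role of the per-hash array-processor append.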

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* Add OCSF DNS Activity normalization to coredns pipeline

Map CoreDNS query/response logs to OCSF DNS Activity [4003]. Adds OCSF
facets, a single-class sub-pipeline (no pre-transformation), and the
generated expected OCSF blocks in the test fixtures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Align coredns OCSF facet names with cloudflare and route53

validate-logs flagged five OCSF facet path conflicts. Rename to the
canonical form used by the existing DNS integrations and add the
`type: integer` annotation expected on `ocsf.rcode_id` and
`ocsf.src_endpoint.port`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Add facetType range to ocsf.src_endpoint.port facet

validate-logs asks for `facetType: range` on this facet path. Match the
form that CI's canonical-suggestion message printed for
ocsf.src_endpoint.port.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Remove redundant fallbacks

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@pull pull Bot locked and limited conversation to collaborators May 18, 2026
@pull pull Bot added the ⤵️ pull label May 18, 2026
@pull pull Bot merged commit 5ee80d2 into ConnectionMaster:master May 18, 2026